Automatic Exploitation of Input Parallelism

نویسندگان

  • TAEWOOK OH
  • Taewook Oh
چکیده

Parallelism may reside in the input of a program rather than the program itself. A script interpreter, for example, is hard to parallelize because its dynamic behavior is unpredictable until an input script is given. Once the interpreter is combined with the script, the resulting program becomes predictable, and even parallelizable if the input script contains parallelism. Despite recent progress in automatic parallelization research, however, existing techniques cannot take advantage of the parallelism within program inputs, even when the inputs remain fixed across multiple executions of the program. This dissertation shows that the automatic exploitation of parallelism within fixed program inputs can be achieved by coupling program specialization with automatic parallelization techniques. Program specialization marries a program with the values that remain invariant across the program execution, including fixed inputs, and creates a program that is highly optimized for the invariants. The proposed technique exploits program specialization as an enabling transformation for automatic parallelization; through specialization, the parallelism within the fixed program inputs can be materialized within the specialized program. First, this dissertation presents Invariant-induced Pattern-based Loop Specialization (IPLS). IPLS folds the parallelism within the program invariants into the specialized program, thereby creating a more complete and predictable program that is easier to parallelize. Second, this dissertation applies automatic speculative parallelization techniques to specialized programs to exploit parallelism in inputs. As existing techniques fail to extract parallelism from complex programs such as IPLS specialized programs, context-sensitive speculation and optimized design of the speculation run-time system are proposed to improve the applicability and minimize the execution overhead of the parallelized program. A prototype of the proposed technique is evaluated against two widely-used opensource script interpreters. Experimental results demonstrate the effectiveness of the proposed techniques.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Efficient Exploitation of Parallelism on Pentium III and Pentium 4 Processor-Based Systems

Systems based on the Pentium III and Pentium 4 processors enable the exploitation of parallelism at a fineand medium-grained level. Dualand quad-processor systems, for example, enable the exploitation of mediumgrained parallelism by using multithreaded code that takes advantage of multiple control and arithmetic logic units. Streaming Single-Instruction-Multiple-Data (SIMD) extensions, on the o...

متن کامل

Automatic Parallelism Exploitation for FPL-Based Accelerators

The paper introduces the first complete programming framework for coarse grain dynamically reconfigurable accelerators and their application development. It includes a general model for cooperating host/accelerator platforms and a parallelizing compilation technique derived from it. The paper is an introduction illustrating these techniques and their principles by examples: a machine architectu...

متن کامل

Parallelization in Co-Compilation for Configurable Accelerators - A Host / Accelerator Partitioning Compilation Method

Fig. 1: Makimoto’s wave: summarizing the history of paradigm shifts in semiconductor markets. the future? Abstract— The paper introduces a novel co-compiler and its “vertical” parallelization method, including a general model for co-operating host/accelerator platforms and a new parallelizing compilation technique derived from it. Small examples are used for illustration. It explains the exploi...

متن کامل

Towards Automatic Profile-Driven Parallelization of Embedded Multimedia Applications

Despite the availability of ample parallelism in multimedia applications parallelizing compilers are largely unable to extract this application parallelism and map it onto existing embedded multi-core platforms. This is mainly due to the limitations of traditional autoparallelization on static analysis and loop-level parallelism. In this paper we propose a dynamic, profile-driven approach to au...

متن کامل

MUSKEL: an expandable skeleton environment

Programming models based on algorithmic skeletons promise to raise the level of abstraction perceived by programmers when implementing parallel applications, while guaranteeing good performance figures. At the same time, however, they restrict the freedom of programmers to implement arbitrary parallelism exploitation patterns. In fact, efficiency is achieved by restricting the parallelism explo...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015